Overview

Brought to you by YData

Dataset statistics

                                 Training Data    Test Data
Number of variables              12               11
Number of observations           593994           254569
Missing cells                    0                0
Missing cells (%)                0.0%             0.0%
Duplicate rows                   0                0
Duplicate rows (%)               0.0%             0.0%
Total size in memory             54.4 MiB         21.4 MiB
Average record size in memory    96.0 B           88.0 B
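
The memory figures above are consistent with each other: total size is roughly the average record size multiplied by the number of observations. The following is a minimal sketch of that arithmetic in Python, using only values from the table. The helper name `total_mib` is hypothetical, and 1 MiB is assumed to be 2^20 bytes.

```python
# Values copied from the Dataset statistics table.
OBSERVATIONS = {"Training Data": 593_994, "Test Data": 254_569}
AVG_RECORD_BYTES = {"Training Data": 96.0, "Test Data": 88.0}

BYTES_PER_MIB = 2 ** 20  # assumption: MiB means 1,048,576 bytes


def total_mib(n_rows: int, avg_record_bytes: float) -> float:
    """Estimate total in-memory size in MiB from row count and mean row size."""
    return n_rows * avg_record_bytes / BYTES_PER_MIB


estimates = {
    name: round(total_mib(OBSERVATIONS[name], AVG_RECORD_BYTES[name]), 1)
    for name in OBSERVATIONS
}

for name, mib in estimates.items():
    print(f"{name}: about {mib} MiB")
```

Both estimates round to the totals reported in the table, which suggests the per-record size is simply the total divided by the row count.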

Variable types

               Training Data    Test Data
Numeric        5                5
Categorical    7                6

Alerts

Alert                                                                              Type                Training Data    Test Data
credit_score is highly overall correlated with grade_subgrade and 1 other field    High correlation    present          present
employment_status is highly overall correlated with loan_paid_back                 High correlation    present          not present
grade_subgrade is highly overall correlated with credit_score                      High correlation    present          present
interest_rate is highly overall correlated with credit_score                       High correlation    present          present
loan_paid_back is highly overall correlated with employment_status                 High correlation    present          not present

Reproduction

                          Training Data                 Test Data
Analysis started          2025-11-14 17:00:20.786996    2025-11-14 17:00:38.390723
Analysis finished         2025-11-14 17:00:38.380925    2025-11-14 17:00:45.493136
Duration                  17.59 seconds                 7.1 seconds
Software version          ydata-profiling v4.17.0       ydata-profiling v4.17.0
Download configuration    config.json                   config.json
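
The reported durations follow directly from the start and finish timestamps. As an illustrative sketch (the helper `duration_seconds` and the format string are assumptions for this example, not part of the report), the subtraction can be reproduced with the Python standard library:

```python
from datetime import datetime

# Timestamp layout assumed from the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

# (analysis started, analysis finished), copied from the table.
runs = {
    "Training Data": ("2025-11-14 17:00:20.786996", "2025-11-14 17:00:38.380925"),
    "Test Data": ("2025-11-14 17:00:38.390723", "2025-11-14 17:00:45.493136"),
}


def duration_seconds(started: str, finished: str) -> float:
    """Return the elapsed time between two timestamps in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


durations = {name: duration_seconds(*times) for name, times in runs.items()}

for name, seconds in durations.items():
    print(f"{name}: {seconds:.2f} seconds")
```

The timestamps also show that the test profile started roughly 10 ms after the training profile finished, so the two analyses appear to have run back to back.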

Variables

annual_income
Real number (ℝ)

                 Training Data    Test Data
Distinct         119728           67287
Distinct (%)     20.2%            26.4%
Missing          0                0
Missing (%)      0.0%             0.0%
Infinite         0                0
Infinite (%)     0.0%             0.0%
Mean             48212.203        48233.08
Minimum          6002.43          6011.77
Maximum          393381.74        380653.94
Zeros            0                0
Zeros (%)        0.0%             0.0%
Negative         0                0
Negative (%)     0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Quantile statistics

                             Training Data    Test Data
Minimum                      6002.43          6011.77
5-th percentile              15450.11         15447.27
Q1                           27934.4          27950.3
median                       46557.68         46528.98
Q3                           60981.32         61149.44
95-th percentile             93534.68         93534.48
Maximum                      393381.74        380653.94
Range                        387379.31        374642.17
Interquartile range (IQR)    33046.92         33199.14

Descriptive statistics

                                    Training Data         Test Data
Standard deviation                  26711.942             26719.659
Coefficient of variation (CV)       0.5540494             0.55396957
Kurtosis                            7.0914126             7.1975949
Mean                                48212.203             48233.08
Median Absolute Deviation (MAD)     17068.9               17123.71
Skewness                            1.7195087             1.7210527
Sum                                 2.8637759 × 10^10     1.2278647 × 10^10
Variance                            7.1352785 × 10^8      7.1394015 × 10^8
Monotonicity                        Not monotonic         Not monotonic
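
Several of the annual_income statistics can be derived from the others, which is a useful sanity check when reading the tables. The sketch below recomputes them for the training data from the reported values only. The dictionary keys are hypothetical names chosen for this example, and small differences are expected because the inputs are already rounded.

```python
# Reported training-data values for annual_income.
train = {
    "n": 593_994,
    "distinct": 119_728,
    "min": 6002.43,
    "max": 393381.74,
    "q1": 27934.4,
    "q3": 60981.32,
    "mean": 48212.203,
    "std": 26711.942,
}

derived = {
    # Share of distinct values among all observations, in percent.
    "distinct_pct": 100 * train["distinct"] / train["n"],
    # Range is maximum minus minimum.
    "range": train["max"] - train["min"],
    # Interquartile range is Q3 minus Q1.
    "iqr": train["q3"] - train["q1"],
    # Coefficient of variation is standard deviation over mean.
    "cv": train["std"] / train["mean"],
    # Variance is the squared standard deviation.
    "variance": train["std"] ** 2,
    # Sum is mean times number of observations.
    "sum": train["mean"] * train["n"],
}

for name, value in derived.items():
    print(f"{name}: {value:.7g}")
```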
Histogram with fixed size bins (bins=50)
Common values, Training Data
Value                    Count     Frequency (%)
51351.71                 238       < 0.1%
25499.88                 227       < 0.1%
24113.12                 219       < 0.1%
56547.75                 209       < 0.1%
26386.33                 187       < 0.1%
28991.07                 185       < 0.1%
16077.08                 170       < 0.1%
46949.29                 160       < 0.1%
53981.9                  152       < 0.1%
52628.69                 146       < 0.1%
Other values (119718)    592101    99.7%

Common values, Test Data
Value                    Count     Frequency (%)
25499.88                 92        < 0.1%
51351.71                 86        < 0.1%
24113.12                 86        < 0.1%
56547.75                 84        < 0.1%
28991.07                 78        < 0.1%
16077.08                 75        < 0.1%
26386.33                 70        < 0.1%
52628.69                 64        < 0.1%
61519.27                 64        < 0.1%
51773.69                 64        < 0.1%
Other values (67277)     253806    99.7%
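
The "Other values" row in the tables above is a remainder: the number of distinct values and observations left after the ten most common values are listed. A small sketch of that bookkeeping for the training data follows. The variable names are illustrative, and the counts are copied from the report.

```python
# Training-data totals for annual_income, taken from the report.
N_TRAIN = 593_994
DISTINCT_TRAIN = 119_728

# Counts of the ten most common values, in the order listed.
top_counts = [238, 227, 219, 209, 187, 185, 170, 160, 152, 146]

# Distinct values not shown individually.
other_distinct = DISTINCT_TRAIN - len(top_counts)
# Observations that fall outside the ten most common values.
other_count = N_TRAIN - sum(top_counts)
# Their share of all observations, in percent.
other_pct = 100 * other_count / N_TRAIN

print(f"Other values ({other_distinct}): {other_count} ({other_pct:.1f}%)")
```

Because the top ten values together cover fewer than 2,000 rows, annual_income behaves like a continuous variable with light repetition rather than a binned one.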
Extreme values (smallest 10), Training Data
Value      Count    Frequency (%)
6002.43    1        < 0.1%
6008.56    1        < 0.1%
6026.31    3        < 0.1%
6026.47    1        < 0.1%
6026.71    1        < 0.1%
6064.78    1        < 0.1%
6071.69    1        < 0.1%
6073.15    1        < 0.1%
6074.92    1        < 0.1%
6093.55    1        < 0.1%

Extreme values (smallest 10), Test Data
Value      Count    Frequency (%)
6011.77    1        < 0.1%
6018.23    1        < 0.1%
6018.9     1        < 0.1%
6026.31    3        < 0.1%
6073.64    1        < 0.1%
6100.16    1        < 0.1%
6100.32    13       < 0.1%
6100.33    1        < 0.1%
6105.99    2        < 0.1%
6109.87    3        < 0.1%

debt_to_income_ratio
Real number (ℝ)

                 Training Data    Test Data
Distinct         526              506
Distinct (%)     0.1%             0.2%
Missing          0                0
Missing (%)      0.0%             0.0%
Infinite         0                0
Infinite (%)     0.0%             0.0%
Mean             0.12069589       0.12058304
Minimum          0.011            0.011
Maximum          0.627            0.627
Zeros            0                0
Zeros (%)        0.0%             0.0%
Negative         0                0
Negative (%)     0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Quantile statistics

                             Training Data    Test Data
Minimum                      0.011            0.011
5-th percentile              0.046            0.046
Q1                           0.072            0.072
median                       0.096            0.096
Q3                           0.156            0.156
95-th percentile             0.259            0.259
Maximum                      0.627            0.627
Range                        0.616            0.616
Interquartile range (IQR)    0.084            0.084

Descriptive statistics

                                    Training Data    Test Data
Standard deviation                  0.068573259      0.06858229
Coefficient of variation (CV)       0.56814907       0.56875568
Kurtosis                            2.33523          2.4084944
Mean                                0.12069589       0.12058304
Median Absolute Deviation (MAD)     0.032            0.032
Skewness                            1.4066799        1.4199971
Sum                                 71692.635        30696.704
Variance                            0.0047022918     0.0047035305
Monotonicity                        Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
Common values, Training Data
Value                 Count     Frequency (%)
0.09                  11440     1.9%
0.093                 11160     1.9%
0.097                 9508      1.6%
0.079                 9099      1.5%
0.094                 8976      1.5%
0.098                 8647      1.5%
0.071                 8192      1.4%
0.096                 7715      1.3%
0.063                 7579      1.3%
0.067                 7373      1.2%
Other values (516)    504305    84.9%

Common values, Test Data
Value                 Count     Frequency (%)
0.093                 4951      1.9%
0.09                  4947      1.9%
0.097                 4003      1.6%
0.094                 3870      1.5%
0.079                 3797      1.5%
0.098                 3784      1.5%
0.071                 3452      1.4%
0.096                 3384      1.3%
0.063                 3282      1.3%
0.067                 3144      1.2%
Other values (496)    215955    84.8%
Extreme values (smallest 10), Training Data
Value    Count    Frequency (%)
0.011    169      < 0.1%
0.012    55       < 0.1%
0.013    127      < 0.1%
0.014    243      < 0.1%
0.015    138      < 0.1%
0.016    80       < 0.1%
0.017    205      < 0.1%
0.018    186      < 0.1%
0.019    61       < 0.1%
0.02     152      < 0.1%

Extreme values (smallest 10), Test Data
Value    Count    Frequency (%)
0.011    77       < 0.1%
0.012    20       < 0.1%
0.013    61       < 0.1%
0.014    96       < 0.1%
0.015    69       < 0.1%
0.016    36       < 0.1%
0.017    109      < 0.1%
0.018    83       < 0.1%
0.019    22       < 0.1%
0.02     58       < 0.1%

credit_score
Real number (ℝ)

                 Training Data    Test Data
Distinct         399              389
Distinct (%)     0.1%             0.2%
Missing          0                0
Missing (%)      0.0%             0.0%
Infinite         0                0
Infinite (%)     0.0%             0.0%
Mean             680.91601        681.03769
Minimum          395              395
Maximum          849              849
Zeros            0                0
Zeros (%)        0.0%             0.0%
Negative         0                0
Negative (%)     0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Quantile statistics

                             Training Data    Test Data
Minimum                      395              395
5-th percentile              582              582
Q1                           646              646
median                       682              683
Q3                           719              719
95-th percentile             767              767
Maximum                      849              849
Range                        454              454
Interquartile range (IQR)    73               73

Descriptive statistics

                                    Training Data       Test Data
Standard deviation                  55.424956           55.624118
Coefficient of variation (CV)       0.08139764          0.081675535
Kurtosis                            0.09596164          0.10512082
Mean                                680.91601           681.03769
Median Absolute Deviation (MAD)     36                  36
Skewness                            -0.16699288         -0.17167056
Sum                                 4.0446002 × 10^8    1.7337108 × 10^8
Variance                            3071.9257           3094.0425
Monotonicity                        Not monotonic       Not monotonic
Histogram with fixed size bins (bins=50)
Common values, Training Data
Value                 Count     Frequency (%)
678                   6526      1.1%
661                   5801      1.0%
674                   5793      1.0%
708                   5661      1.0%
681                   5635      0.9%
672                   5622      0.9%
669                   5618      0.9%
685                   5557      0.9%
713                   5544      0.9%
676                   5508      0.9%
Other values (389)    536729    90.4%

Common values, Test Data
Value                 Count     Frequency (%)
678                   2711      1.1%
672                   2531      1.0%
661                   2520      1.0%
669                   2457      1.0%
708                   2443      1.0%
681                   2442      1.0%
713                   2425      1.0%
674                   2354      0.9%
676                   2321      0.9%
688                   2304      0.9%
Other values (379)    230061    90.4%
Extreme values (smallest 10), Training Data
Value    Count    Frequency (%)
395      2        < 0.1%
431      1        < 0.1%
435      2        < 0.1%
437      3        < 0.1%
439      1        < 0.1%
440      1        < 0.1%
441      1        < 0.1%
445      4        < 0.1%
446      2        < 0.1%
447      2        < 0.1%

Extreme values (smallest 10), Test Data
Value    Count    Frequency (%)
395      1        < 0.1%
431      1        < 0.1%
439      1        < 0.1%
442      1        < 0.1%
443      1        < 0.1%
445      3        < 0.1%
447      1        < 0.1%
449      1        < 0.1%
453      2        < 0.1%
459      1        < 0.1%

loan_amount
Real number (ℝ)

                 Training Data    Test Data
Distinct         111570           65199
Distinct (%)     18.8%            25.6%
Missing          0                0
Missing (%)      0.0%             0.0%
Infinite         0                0
Infinite (%)     0.0%             0.0%
Mean             15020.298        15016.753
Minimum          500.09           500.05
Maximum          48959.95         48959.26
Zeros            0                0
Zeros (%)        0.0%             0.0%
Negative         0                0
Negative (%)     0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Quantile statistics

                             Training Data    Test Data
Minimum                      500.09           500.05
5-th percentile              3139.37          3144.66
Q1                           10279.62         10248.58
median                       15000.22         15000.22
Q3                           18858.58         18831.46
95-th percentile             27139.83         27124.17
Maximum                      48959.95         48959.26
Range                        48459.86         48459.21
Interquartile range (IQR)    8578.96          8582.88

Descriptive statistics

                                    Training Data       Test Data
Standard deviation                  6926.5306           6922.1652
Coefficient of variation (CV)       0.46114469          0.46096283
Kurtosis                            -0.15014223         -0.15334262
Mean                                15020.298           15016.753
Median Absolute Deviation (MAD)     4386.47             4393.43
Skewness                            0.20735982          0.20573601
Sum                                 8.9219667 × 10^9    3.8227999 × 10^9
Variance                            47976826            47916371
Monotonicity                        Not monotonic       Not monotonic
Histogram with fixed size bins (bins=50)
Common values, Training Data
Value                    Count     Frequency (%)
12892.25                 412       0.1%
15212.88                 338       0.1%
16004.97                 282       < 0.1%
1838.88                  278       < 0.1%
17051.01                 255       < 0.1%
15011.15                 250       < 0.1%
18078.57                 241       < 0.1%
12551.14                 241       < 0.1%
18054.98                 237       < 0.1%
8146.24                  232       < 0.1%
Other values (111560)    591228    99.5%

Common values, Test Data
Value                    Count     Frequency (%)
12892.25                 190       0.1%
16004.97                 118       < 0.1%
1838.88                  113       < 0.1%
18078.57                 112       < 0.1%
15212.88                 109       < 0.1%
17051.01                 107       < 0.1%
12093.58                 107       < 0.1%
15011.15                 104       < 0.1%
17054.68                 102       < 0.1%
12093.5                  102       < 0.1%
Other values (65189)     253405    99.5%
Extreme values (smallest 10), Training Data
Value     Count    Frequency (%)
500.09    1        < 0.1%
500.37    1        < 0.1%
500.91    1        < 0.1%
502.91    1        < 0.1%
507.41    1        < 0.1%
507.42    1        < 0.1%
507.46    3        < 0.1%
507.86    1        < 0.1%
508.34    1        < 0.1%
508.35    1        < 0.1%

Extreme values (smallest 10), Test Data
Value     Count    Frequency (%)
500.05    1        < 0.1%
502.91    1        < 0.1%
507.46    1        < 0.1%
508.51    1        < 0.1%
508.73    1        < 0.1%
514       1        < 0.1%
514.07    1        < 0.1%
514.16    1        < 0.1%
514.4     2        < 0.1%
514.5     8        < 0.1%

interest_rate
Real number (ℝ)

                 Training Data    Test Data
Distinct         1454             1385
Distinct (%)     0.2%             0.5%
Missing          0                0
Missing (%)      0.0%             0.0%
Infinite         0                0
Infinite (%)     0.0%             0.0%
Mean             12.356345        12.352323
Minimum          3.2              3.2
Maximum          20.99            21.29
Zeros            0                0
Zeros (%)        0.0%             0.0%
Negative         0                0
Negative (%)     0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Quantile statistics

                             Training Data    Test Data
Minimum                      3.2              3.2
5-th percentile              9.1              9.07
Q1                           10.99            10.98
median                       12.37            12.37
Q3                           13.68            13.69
95-th percentile             15.72            15.72
Maximum                      20.99            21.29
Range                        17.79            18.09
Interquartile range (IQR)    2.69             2.71

Descriptive statistics

                                    Training Data    Test Data
Standard deviation                  2.0089589        2.0176018
Coefficient of variation (CV)       0.1625852        0.16333783
Kurtosis                            0.059797501      0.055029633
Mean                                12.356345        12.352323
Median Absolute Deviation (MAD)     1.34             1.35
Skewness                            0.049945315      0.043392647
Sum                                 7339594.9        3144518.6
Variance                            4.0359159        4.0707169
Monotonicity                        Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
Common values, Training Data
Value                  Count     Frequency (%)
12.31                  2638      0.4%
12.52                  2436      0.4%
13.35                  2415      0.4%
12.82                  2406      0.4%
12.23                  2362      0.4%
11.26                  2318      0.4%
11.6                   2236      0.4%
13.78                  2222      0.4%
12.09                  2215      0.4%
12.81                  2209      0.4%
Other values (1444)    570537    96.1%

Common values, Test Data
Value                  Count     Frequency (%)
12.31                  1112      0.4%
12.82                  1043      0.4%
12.52                  1005      0.4%
13.35                  1005      0.4%
11.26                  992       0.4%
12.23                  985       0.4%
11.6                   974       0.4%
13.78                  951       0.4%
12.98                  948       0.4%
12.35                  931       0.4%
Other values (1375)    244623    96.1%
Extreme values (smallest 10), Training Data
Value    Count    Frequency (%)
3.2      1        < 0.1%
3.32     1        < 0.1%
3.66     1        < 0.1%
3.79     1        < 0.1%
3.81     3        < 0.1%
3.83     1        < 0.1%
3.89     2        < 0.1%
3.92     1        < 0.1%
3.98     2        < 0.1%
4.01     1        < 0.1%

Extreme values (smallest 10), Test Data
Value    Count    Frequency (%)
3.2      1        < 0.1%
3.79     1        < 0.1%
3.81     1        < 0.1%
3.97     1        < 0.1%
4        1        < 0.1%
4.11     1        < 0.1%
4.18     2        < 0.1%
4.28     1        < 0.1%
4.29     1        < 0.1%
4.3      1        < 0.1%

gender
Categorical

                 Training Data    Test Data
Distinct         3                3
Distinct (%)     < 0.1%           < 0.1%
Missing          0                0
Missing (%)      0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Category counts, Training Data: Female 306175, Male 284091, Other 3728
Category counts, Test Data: Female 131480, Male 121447, Other 1642

Length

                  Training Data    Test Data
Max length        6                6
Median length     6                6
Mean length       5.0371788        5.0394117
Min length        4                4
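
The length statistics above follow from the category counts alone, since every row holds one of three fixed strings. As a rough sketch (variable names are illustrative, counts are the training-data values from this report), the mean length is a count-weighted average of the string lengths:

```python
# Training-data category counts for gender, from the report.
category_counts = {"Female": 306_175, "Male": 284_091, "Other": 3_728}

n_rows = sum(category_counts.values())

# Total characters: each category contributes len(label) per row.
total_chars = sum(len(label) * count for label, count in category_counts.items())

mean_length = total_chars / n_rows
max_length = max(len(label) for label in category_counts)
min_length = min(len(label) for label in category_counts)

print(f"rows={n_rows}, total characters={total_chars}")
print(f"mean={mean_length:.7f}, max={max_length}, min={min_length}")
```

The total character count produced this way matches the "Total characters" figure reported under Characters and Unicode for the training data.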

Characters and Unicode

                       Training Data    Test Data
Total characters       2992054          1282878
Distinct characters    10               10
Distinct categories    1                1
Distinct scripts       1                1
Distinct blocks        1                1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

               Training Data    Test Data
Unique         0                0
Unique (%)     0.0%             0.0%

Sample

           Training Data    Test Data
1st row    Female           Female
2nd row    Male             Female
3rd row    Male             Male
4th row    Female           Female
5th row    Male             Female

Common Values

Training Data
Value     Count     Frequency (%)
Female    306175    51.5%
Male      284091    47.8%
Other     3728      0.6%

Test Data
Value     Count     Frequency (%)
Female    131480    51.6%
Male      121447    47.7%
Other     1642      0.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values, Training Data]

[Figure: common values, Test Data]

Word frequencies, Training Data
Value     Count     Frequency (%)
female    306175    51.5%
male      284091    47.8%
other     3728      0.6%

Word frequencies, Test Data
Value     Count     Frequency (%)
female    131480    51.6%
male      121447    47.7%
other     1642      0.6%

Most occurring characters

Training Data
Value    Count     Frequency (%)
e        900169    30.1%
a        590266    19.7%
l        590266    19.7%
F        306175    10.2%
m        306175    10.2%
M        284091    9.5%
O        3728      0.1%
t        3728      0.1%
h        3728      0.1%
r        3728      0.1%

Test Data
Value    Count     Frequency (%)
e        386049    30.1%
a        252927    19.7%
l        252927    19.7%
F        131480    10.2%
m        131480    10.2%
M        121447    9.5%
O        1642      0.1%
t        1642      0.1%
h        1642      0.1%
r        1642      0.1%
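
The character counts above can be rebuilt from the category counts: each row contributes every character of its label once. The following sketch, using `collections.Counter` from the standard library and the training-data counts from this report, illustrates the idea. It is not the profiler's own code.

```python
from collections import Counter

# Training-data category counts for gender, from the report.
category_counts = {"Female": 306_175, "Male": 284_091, "Other": 3_728}

# Weight each character of a label by the number of rows carrying that label.
char_counts = Counter()
for label, n_rows in category_counts.items():
    for ch in label:
        char_counts[ch] += n_rows

total_chars = sum(char_counts.values())

for ch, count in char_counts.most_common():
    print(f"{ch!r}: {count} ({100 * count / total_chars:.1f}%)")
```

For example, "e" appears twice in "Female" and once each in "Male" and "Other", which is why it is the most frequent character.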

Most occurring categories

ValueCountFrequency (%)
(unknown)2992054
100.0%
ValueCountFrequency (%)
(unknown)1282878
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e900169
30.1%
a590266
19.7%
l590266
19.7%
F306175
 
10.2%
m306175
 
10.2%
M284091
 
9.5%
O3728
 
0.1%
t3728
 
0.1%
h3728
 
0.1%
r3728
 
0.1%
ValueCountFrequency (%)
e386049
30.1%
a252927
19.7%
l252927
19.7%
F131480
 
10.2%
m131480
 
10.2%
M121447
 
9.5%
O1642
 
0.1%
t1642
 
0.1%
h1642
 
0.1%
r1642
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown)2992054
100.0%
ValueCountFrequency (%)
(unknown)1282878
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e900169
30.1%
a590266
19.7%
l590266
19.7%
F306175
 
10.2%
m306175
 
10.2%
M284091
 
9.5%
O3728
 
0.1%
t3728
 
0.1%
h3728
 
0.1%
r3728
 
0.1%
ValueCountFrequency (%)
e386049
30.1%
a252927
19.7%
l252927
19.7%
F131480
 
10.2%
m131480
 
10.2%
M121447
 
9.5%
O1642
 
0.1%
t1642
 
0.1%
h1642
 
0.1%
r1642
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown)2992054
100.0%
ValueCountFrequency (%)
(unknown)1282878
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e900169
30.1%
a590266
19.7%
l590266
19.7%
F306175
 
10.2%
m306175
 
10.2%
M284091
 
9.5%
O3728
 
0.1%
t3728
 
0.1%
h3728
 
0.1%
r3728
 
0.1%
ValueCountFrequency (%)
e386049
30.1%
a252927
19.7%
l252927
19.7%
F131480
 
10.2%
m131480
 
10.2%
M121447
 
9.5%
O1642
 
0.1%
t1642
 
0.1%
h1642
 
0.1%
r1642
 
0.1%

marital_status
Categorical

                 Training Data    Test Data
Distinct         4                4
Distinct (%)     < 0.1%           < 0.1%
Missing          0                0
Missing (%)      0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Category counts, Training Data: Single 288843, Married 277239, Divorced 21312, Widowed 6600
Category counts, Test Data: Single 123686, Married 119000, Divorced 9122, Widowed 2761

Length

                  Training Data    Test Data
Max length        8                8
Median length     7                7
Mean length       6.5496066        6.5499688
Min length        6                6

Characters and Unicode

                       Training Data    Test Data
Total characters       3890427          1667419
Distinct characters    16               16
Distinct categories    1                1
Distinct scripts       1                1
Distinct blocks        1                1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

               Training Data    Test Data
Unique         0                0
Unique (%)     0.0%             0.0%

Sample

           Training Data    Test Data
1st row    Single           Single
2nd row    Married          Married
3rd row    Single           Single
4th row    Single           Single
5th row    Married          Married

Common Values

Training Data
Value       Count     Frequency (%)
Single      288843    48.6%
Married     277239    46.7%
Divorced    21312     3.6%
Widowed     6600      1.1%

Test Data
Value       Count     Frequency (%)
Single      123686    48.6%
Married     119000    46.7%
Divorced    9122      3.6%
Widowed     2761      1.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values, Training Data]

[Figure: common values, Test Data]

Word frequencies, Training Data
Value       Count     Frequency (%)
single      288843    48.6%
married     277239    46.7%
divorced    21312     3.6%
widowed     6600      1.1%

Word frequencies, Test Data
Value       Count     Frequency (%)
single      123686    48.6%
married     119000    46.7%
divorced    9122      3.6%
widowed     2761      1.1%

Most occurring characters

ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i254569
15.3%
e254569
15.3%
r247122
14.8%
d133644
8.0%
g123686
7.4%
l123686
7.4%
n123686
7.4%
S123686
7.4%
a119000
7.1%
M119000
7.1%
Other values (6)44771
 
2.7%

Most occurring categories

ValueCountFrequency (%)
(unknown)3890427
100.0%
ValueCountFrequency (%)
(unknown)1667419
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i254569
15.3%
e254569
15.3%
r247122
14.8%
d133644
8.0%
g123686
7.4%
l123686
7.4%
n123686
7.4%
S123686
7.4%
a119000
7.1%
M119000
7.1%
Other values (6)44771
 
2.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown)3890427
100.0%
ValueCountFrequency (%)
(unknown)1667419
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i254569
15.3%
e254569
15.3%
r247122
14.8%
d133644
8.0%
g123686
7.4%
l123686
7.4%
n123686
7.4%
S123686
7.4%
a119000
7.1%
M119000
7.1%
Other values (6)44771
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown)3890427
100.0%
ValueCountFrequency (%)
(unknown)1667419
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i254569
15.3%
e254569
15.3%
r247122
14.8%
d133644
8.0%
g123686
7.4%
l123686
7.4%
n123686
7.4%
S123686
7.4%
a119000
7.1%
M119000
7.1%
Other values (6)44771
 
2.7%

education_level
Categorical

                 Training Data    Test Data
Distinct         5                5
Distinct (%)     < 0.1%           < 0.1%
Missing          0                0
Missing (%)      0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Category counts, Training Data: Bachelor's 279606, High School 183592, Master's 93097, Other 26677, PhD 11022
Category counts, Test Data: Bachelor's 119924, High School 78687, Master's 39826, Other 11325, PhD 4807

Length

                  Training Data    Test Data
Max length        11               11
Median length     10               10
Mean length       9.6411731        9.6415942
Min length        3                3

Characters and Unicode

                       Training Data    Test Data
Total characters       5726799          2454451
Distinct characters    20               20
Distinct categories    1                1
Distinct scripts       1                1
Distinct blocks        1                1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

               Training Data    Test Data
Unique         0                0
Unique (%)     0.0%             0.0%

Sample

           Training Data    Test Data
1st row    High School      High School
2nd row    Master's         Master's
3rd row    High School      Bachelor's
4th row    High School      Bachelor's
5th row    High School      PhD

Common Values

Training Data
Value          Count     Frequency (%)
Bachelor's     279606    47.1%
High School    183592    30.9%
Master's       93097     15.7%
Other          26677     4.5%
PhD            11022     1.9%

Test Data
Value          Count     Frequency (%)
Bachelor's     119924    47.1%
High School    78687     30.9%
Master's       39826     15.6%
Other          11325     4.4%
PhD            4807      1.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values, Training Data]

[Figure: common values, Test Data]

Word frequencies, Training Data
Value         Count     Frequency (%)
bachelor's    279606    36.0%
high          183592    23.6%
school        183592    23.6%
master's      93097     12.0%
other         26677     3.4%
phd           11022     1.4%

Word frequencies, Test Data
Value         Count     Frequency (%)
bachelor's    119924    36.0%
high          78687     23.6%
school        78687     23.6%
master's      39826     12.0%
other         11325     3.4%
phd           4807      1.4%
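
In the lowercase word tables above, "High School" is counted as two separate words, so the percentages are shares of all words rather than shares of rows. The sketch below reproduces the training-data figures under the assumption that words are obtained by lowercasing each label and splitting on whitespace. That tokenization rule is inferred from the table, not taken from the profiler's source.

```python
from collections import Counter

# Training-data category counts for education_level, from the report.
category_counts = {
    "Bachelor's": 279_606,
    "High School": 183_592,
    "Master's": 93_097,
    "Other": 26_677,
    "PhD": 11_022,
}

# Assumed tokenization: lowercase, then split on whitespace.
word_counts = Counter()
for label, n_rows in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += n_rows

total_words = sum(word_counts.values())
word_pct = {
    word: round(100 * count / total_words, 1)
    for word, count in word_counts.items()
}

for word, count in word_counts.most_common():
    print(f"{word}: {count} ({word_pct[word]}%)")
```

This explains why "bachelor's" is 47.1% of rows in the Common Values table but only 36.0% of words here: the denominator grows by one extra word for every "High School" row.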

Most occurring characters

ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h293430
12.0%
o277298
11.3%
s199576
 
8.1%
c198611
 
8.1%
l198611
 
8.1%
e171075
 
7.0%
r171075
 
7.0%
a159750
 
6.5%
'159750
 
6.5%
B119924
 
4.9%
Other values (10)505351
20.6%

Most occurring categories

ValueCountFrequency (%)
(unknown)5726799
100.0%
ValueCountFrequency (%)
(unknown)2454451
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h293430
12.0%
o277298
11.3%
s199576
 
8.1%
c198611
 
8.1%
l198611
 
8.1%
e171075
 
7.0%
r171075
 
7.0%
a159750
 
6.5%
'159750
 
6.5%
B119924
 
4.9%
Other values (10)505351
20.6%

Most occurring scripts

ValueCountFrequency (%)
(unknown)5726799
100.0%
ValueCountFrequency (%)
(unknown)2454451
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h293430
12.0%
o277298
11.3%
s199576
 
8.1%
c198611
 
8.1%
l198611
 
8.1%
e171075
 
7.0%
r171075
 
7.0%
a159750
 
6.5%
'159750
 
6.5%
B119924
 
4.9%
Other values (10)505351
20.6%

Most occurring blocks

ValueCountFrequency (%)
(unknown)5726799
100.0%
ValueCountFrequency (%)
(unknown)2454451
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h293430
12.0%
o277298
11.3%
s199576
 
8.1%
c198611
 
8.1%
l198611
 
8.1%
e171075
 
7.0%
r171075
 
7.0%
a159750
 
6.5%
'159750
 
6.5%
B119924
 
4.9%
Other values (10)505351
20.6%

employment_status
Categorical

                 Training Data    Test Data
Distinct         5                5
Distinct (%)     < 0.1%           < 0.1%
Missing          0                0
Missing (%)      0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Category counts, Training Data: Employed 450645, Unemployed 62485, Self-employed 52480, Retired 16453, Student 11931
Category counts, Test Data: Employed 193207, Unemployed 26715, Self-employed 22543, Retired 7060, Student 5044

Length

                  Training Data    Test Data
Max length        13               13
Median length     8                8
Mean length       8.6043596        8.6051051
Min length        7                7

Characters and Unicode

                       Training Data    Test Data
Total characters       5110938          2190593
Distinct characters    18               18
Distinct categories    1                1
Distinct scripts       1                1
Distinct blocks        1                1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

               Training Data    Test Data
Unique         0                0
Unique (%)     0.0%             0.0%

Sample

           Training Data    Test Data
1st row    Self-employed    Employed
2nd row    Employed         Employed
3rd row    Employed         Employed
4th row    Employed         Employed
5th row    Employed         Employed

Common Values

Training Data
Value            Count     Frequency (%)
Employed         450645    75.9%
Unemployed       62485     10.5%
Self-employed    52480     8.8%
Retired          16453     2.8%
Student          11931     2.0%

Test Data
Value            Count     Frequency (%)
Employed         193207    75.9%
Unemployed       26715     10.5%
Self-employed    22543     8.9%
Retired          7060      2.8%
Student          5044      2.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values, Training Data]

[Figure: common values, Test Data]

Word frequencies, Training Data
Value            Count     Frequency (%)
employed         450645    75.9%
unemployed       62485     10.5%
self-employed    52480     8.8%
retired          16453     2.8%
student          11931     2.0%

Word frequencies, Test Data
Value            Count     Frequency (%)
employed         193207    75.9%
unemployed       26715     10.5%
self-employed    22543     8.9%
retired          7060      2.8%
student          5044      2.0%

Most occurring characters

ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e333430
15.2%
l265008
12.1%
d254569
11.6%
m242465
11.1%
y242465
11.1%
p242465
11.1%
o242465
11.1%
E193207
8.8%
n31759
 
1.4%
S27587
 
1.3%
Other values (8)115173
 
5.3%

Most occurring categories

ValueCountFrequency (%)
(unknown)5110938
100.0%
ValueCountFrequency (%)
(unknown)2190593
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e333430
15.2%
l265008
12.1%
d254569
11.6%
m242465
11.1%
y242465
11.1%
p242465
11.1%
o242465
11.1%
E193207
8.8%
n31759
 
1.4%
S27587
 
1.3%
Other values (8)115173
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
(unknown)5110938
100.0%
ValueCountFrequency (%)
(unknown)2190593
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e333430
15.2%
l265008
12.1%
d254569
11.6%
m242465
11.1%
y242465
11.1%
p242465
11.1%
o242465
11.1%
E193207
8.8%
n31759
 
1.4%
S27587
 
1.3%
Other values (8)115173
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
(unknown)5110938
100.0%
ValueCountFrequency (%)
(unknown)2190593
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e333430
15.2%
l265008
12.1%
d254569
11.6%
m242465
11.1%
y242465
11.1%
p242465
11.1%
o242465
11.1%
E193207
8.8%
n31759
 
1.4%
S27587
 
1.3%
Other values (8)115173
 
5.3%

loan_purpose
Categorical

                 Training Data    Test Data
Distinct         8                8
Distinct (%)     < 0.1%           < 0.1%
Missing          0                0
Missing (%)      0.0%             0.0%
Memory size      4.5 MiB          1.9 MiB

Category counts, Training Data: Debt consolidation 324695, Other 63874, Car 58108, Home 44118, Education 36641, Other values (3) 66558
Category counts, Test Data: Debt consolidation 138963, Other 27715, Car 24889, Home 18984, Education 15719, Other values (3) 28299

Length

                  Training Data    Test Data
Max length        18               18
Median length     18               18
Mean length       12.38077         12.368973
Min length        3                3

Characters and Unicode

                       Training Data    Test Data
Total characters       7354103          3148757
Distinct characters    24               24
Distinct categories    1                1
Distinct scripts       1                1
Distinct blocks        1                1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

               Training Data    Test Data
Unique         0                0
Unique (%)     0.0%             0.0%

Sample

           Training Data         Test Data
1st row    Other                 Other
2nd row    Debt consolidation    Other
3rd row    Debt consolidation    Debt consolidation
4th row    Debt consolidation    Debt consolidation
5th row    Other                 Business

Common Values

Training Data
Value                 Count     Frequency (%)
Debt consolidation    324695    54.7%
Other                 63874     10.8%
Car                   58108     9.8%
Home                  44118     7.4%
Education             36641     6.2%
Business              35303     5.9%
Medical               22806     3.8%
Vacation              8449      1.4%

Test Data
Value                 Count     Frequency (%)
Debt consolidation    138963    54.6%
Other                 27715     10.9%
Car                   24889     9.8%
Home                  18984     7.5%
Education             15719     6.2%
Business              15076     5.9%
Medical               9618      3.8%
Vacation              3605      1.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values, Training Data]

[Figure: common values, Test Data]

Word frequencies, Training Data
Value            Count     Frequency (%)
debt             324695    35.3%
consolidation    324695    35.3%
other            63874     7.0%
car              58108     6.3%
home             44118     4.8%
education        36641     4.0%
business         35303     3.8%
medical          22806     2.5%
vacation         8449      0.9%

Word frequencies, Test Data
Value            Count     Frequency (%)
debt             138963    35.3%
consolidation    138963    35.3%
other            27715     7.0%
car              24889     6.3%
home             18984     4.8%
education        15719     4.0%
business         15076     3.8%
medical          9618      2.4%
vacation         3605      0.9%

Most occurring characters

ValueCountFrequency (%)
o1063293
14.5%
t758354
10.3%
i752589
10.2%
n729783
9.9%
e490796
 
6.7%
a459148
 
6.2%
s430604
 
5.9%
c392591
 
5.3%
d384142
 
5.2%
l347501
 
4.7%
Other values (14)1545302
21.0%
ValueCountFrequency (%)
o455197
14.5%
t324965
10.3%
i321944
10.2%
n312326
9.9%
e210356
 
6.7%
a196399
 
6.2%
s184191
 
5.8%
c167905
 
5.3%
d164300
 
5.2%
l148581
 
4.7%
Other values (14)662593
21.0%

Most occurring categories

ValueCountFrequency (%)
(unknown)7354103
100.0%
ValueCountFrequency (%)
(unknown)3148757
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
o1063293
14.5%
t758354
10.3%
i752589
10.2%
n729783
9.9%
e490796
 
6.7%
a459148
 
6.2%
s430604
 
5.9%
c392591
 
5.3%
d384142
 
5.2%
l347501
 
4.7%
Other values (14)1545302
21.0%
ValueCountFrequency (%)
o455197
14.5%
t324965
10.3%
i321944
10.2%
n312326
9.9%
e210356
 
6.7%
a196399
 
6.2%
s184191
 
5.8%
c167905
 
5.3%
d164300
 
5.2%
l148581
 
4.7%
Other values (14)662593
21.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown)7354103
100.0%
ValueCountFrequency (%)
(unknown)3148757
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
o1063293
14.5%
t758354
10.3%
i752589
10.2%
n729783
9.9%
e490796
 
6.7%
a459148
 
6.2%
s430604
 
5.9%
c392591
 
5.3%
d384142
 
5.2%
l347501
 
4.7%
Other values (14)1545302
21.0%
ValueCountFrequency (%)
o455197
14.5%
t324965
10.3%
i321944
10.2%
n312326
9.9%
e210356
 
6.7%
a196399
 
6.2%
s184191
 
5.8%
c167905
 
5.3%
d164300
 
5.2%
l148581
 
4.7%
Other values (14)662593
21.0%

Most occurring blocks

Training Data
Value | Count | Frequency (%)
(unknown) | 7354103 | 100.0%

Test Data
Value | Count | Frequency (%)
(unknown) | 3148757 | 100.0%

Most frequent character per block

(unknown)

Training Data
Value | Count | Frequency (%)
o | 1063293 | 14.5%
t | 758354 | 10.3%
i | 752589 | 10.2%
n | 729783 | 9.9%
e | 490796 | 6.7%
a | 459148 | 6.2%
s | 430604 | 5.9%
c | 392591 | 5.3%
d | 384142 | 5.2%
l | 347501 | 4.7%
Other values (14) | 1545302 | 21.0%

Test Data
Value | Count | Frequency (%)
o | 455197 | 14.5%
t | 324965 | 10.3%
i | 321944 | 10.2%
n | 312326 | 9.9%
e | 210356 | 6.7%
a | 196399 | 6.2%
s | 184191 | 5.8%
c | 167905 | 5.3%
d | 164300 | 5.2%
l | 148581 | 4.7%
Other values (14) | 662593 | 21.0%

grade_subgrade
Categorical

Statistic | Training Data | Test Data
Distinct | 30 | 30
Distinct (%) | < 0.1% | < 0.1%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 4.5 MiB | 1.9 MiB

Value | Training Data | Test Data
C3 | 58695 | 25410
C4 | 55957 | 23712
C2 | 54443 | 23334
C1 | 53363 | 22814
C5 | 53317 | 22777
Other values (25) | 318219 | 136522

Length

Statistic | Training Data | Test Data
Max length | 2 | 2
Median length | 2 | 2
Mean length | 2 | 2
Min length | 2 | 2

Characters and Unicode

Statistic | Training Data | Test Data
Total characters | 1187988 | 509138
Distinct characters | 11 | 11
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
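
As a rough illustration of what these counts measure, the sketch below tallies characters and Unicode general categories for a few grade_subgrade values using Python's standard `unicodedata` module. The helper name `character_profile` and the five-value sample are assumptions made for illustration, and the report's own implementation may differ. The standard library covers general categories only, not scripts or blocks.

```python
import unicodedata
from collections import Counter


def character_profile(values):
    """Count characters and Unicode general categories across string values."""
    chars = Counter()
    categories = Counter()
    for value in values:
        for ch in value:
            chars[ch] += 1
            # e.g. 'Lu' = uppercase letter, 'Nd' = decimal digit
            categories[unicodedata.category(ch)] += 1
    return chars, categories


# Hypothetical sample: the first five training values of grade_subgrade.
sample = ["C3", "D3", "C5", "F1", "D1"]
chars, categories = character_profile(sample)

print(sum(chars.values()))  # total characters: 5 values x 2 characters each
print(len(chars))           # number of distinct characters
print(dict(categories))     # counts per general category
```

Note that `unicodedata` separates letters from digits, whereas this report places every character in a single "(unknown)" category.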

Unique

Statistic | Training Data | Test Data
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Training Data | Test Data
1st row | C3 | D5
2nd row | D3 | C1
3rd row | C5 | D1
4th row | F1 | C3
5th row | D1 | C1

Common Values

Training Data
Value | Count | Frequency (%)
C3 | 58695 | 9.9%
C4 | 55957 | 9.4%
C2 | 54443 | 9.2%
C1 | 53363 | 9.0%
C5 | 53317 | 9.0%
D1 | 37029 | 6.2%
D3 | 36694 | 6.2%
D4 | 35097 | 5.9%
D2 | 34432 | 5.8%
D5 | 32101 | 5.4%
Other values (20) | 142866 | 24.1%

Test Data
Value | Count | Frequency (%)
C3 | 25410 | 10.0%
C4 | 23712 | 9.3%
C2 | 23334 | 9.2%
C1 | 22814 | 9.0%
C5 | 22777 | 8.9%
D1 | 15721 | 6.2%
D3 | 15639 | 6.1%
D4 | 14990 | 5.9%
D2 | 14773 | 5.8%
D5 | 13923 | 5.5%
Other values (20) | 61476 | 24.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Training Data


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)

Test Data


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)

Training Data
Value | Count | Frequency (%)
c3 | 58695 | 9.9%
c4 | 55957 | 9.4%
c2 | 54443 | 9.2%
c1 | 53363 | 9.0%
c5 | 53317 | 9.0%
d1 | 37029 | 6.2%
d3 | 36694 | 6.2%
d4 | 35097 | 5.9%
d2 | 34432 | 5.8%
d5 | 32101 | 5.4%
Other values (20) | 142866 | 24.1%

Test Data
Value | Count | Frequency (%)
c3 | 25410 | 10.0%
c4 | 23712 | 9.3%
c2 | 23334 | 9.2%
c1 | 22814 | 9.0%
c5 | 22777 | 8.9%
d1 | 15721 | 6.2%
d3 | 15639 | 6.1%
d4 | 14990 | 5.9%
d2 | 14773 | 5.8%
d5 | 13923 | 5.5%
Other values (20) | 61476 | 24.1%

Most occurring characters

Training Data
Value | Count | Frequency (%)
C | 275775 | 23.2%
D | 175353 | 14.8%
3 | 123538 | 10.4%
4 | 120203 | 10.1%
1 | 118761 | 10.0%
2 | 117635 | 9.9%
5 | 113857 | 9.6%
B | 71251 | 6.0%
E | 34458 | 2.9%
F | 27301 | 2.3%

Test Data
Value | Count | Frequency (%)
C | 118047 | 23.2%
D | 75046 | 14.7%
3 | 53262 | 10.5%
4 | 51283 | 10.1%
2 | 50597 | 9.9%
1 | 50535 | 9.9%
5 | 48892 | 9.6%
B | 30748 | 6.0%
E | 14703 | 2.9%
F | 11751 | 2.3%

Most occurring categories

Training Data
Value | Count | Frequency (%)
(unknown) | 1187988 | 100.0%

Test Data
Value | Count | Frequency (%)
(unknown) | 509138 | 100.0%

Most frequent character per category

(unknown)

Training Data
Value | Count | Frequency (%)
C | 275775 | 23.2%
D | 175353 | 14.8%
3 | 123538 | 10.4%
4 | 120203 | 10.1%
1 | 118761 | 10.0%
2 | 117635 | 9.9%
5 | 113857 | 9.6%
B | 71251 | 6.0%
E | 34458 | 2.9%
F | 27301 | 2.3%

Test Data
Value | Count | Frequency (%)
C | 118047 | 23.2%
D | 75046 | 14.7%
3 | 53262 | 10.5%
4 | 51283 | 10.1%
2 | 50597 | 9.9%
1 | 50535 | 9.9%
5 | 48892 | 9.6%
B | 30748 | 6.0%
E | 14703 | 2.9%
F | 11751 | 2.3%

Most occurring scripts

Training Data
Value | Count | Frequency (%)
(unknown) | 1187988 | 100.0%

Test Data
Value | Count | Frequency (%)
(unknown) | 509138 | 100.0%

Most frequent character per script

(unknown)

Training Data
Value | Count | Frequency (%)
C | 275775 | 23.2%
D | 175353 | 14.8%
3 | 123538 | 10.4%
4 | 120203 | 10.1%
1 | 118761 | 10.0%
2 | 117635 | 9.9%
5 | 113857 | 9.6%
B | 71251 | 6.0%
E | 34458 | 2.9%
F | 27301 | 2.3%

Test Data
Value | Count | Frequency (%)
C | 118047 | 23.2%
D | 75046 | 14.7%
3 | 53262 | 10.5%
4 | 51283 | 10.1%
2 | 50597 | 9.9%
1 | 50535 | 9.9%
5 | 48892 | 9.6%
B | 30748 | 6.0%
E | 14703 | 2.9%
F | 11751 | 2.3%

Most occurring blocks

Training Data
Value | Count | Frequency (%)
(unknown) | 1187988 | 100.0%

Test Data
Value | Count | Frequency (%)
(unknown) | 509138 | 100.0%

Most frequent character per block

(unknown)

Training Data
Value | Count | Frequency (%)
C | 275775 | 23.2%
D | 175353 | 14.8%
3 | 123538 | 10.4%
4 | 120203 | 10.1%
1 | 118761 | 10.0%
2 | 117635 | 9.9%
5 | 113857 | 9.6%
B | 71251 | 6.0%
E | 34458 | 2.9%
F | 27301 | 2.3%

Test Data
Value | Count | Frequency (%)
C | 118047 | 23.2%
D | 75046 | 14.7%
3 | 53262 | 10.5%
4 | 51283 | 10.1%
2 | 50597 | 9.9%
1 | 50535 | 9.9%
5 | 48892 | 9.6%
B | 30748 | 6.0%
E | 14703 | 2.9%
F | 11751 | 2.3%

loan_paid_back
Categorical

Statistic | Value
Distinct | 2
Distinct (%) | < 0.1%
Missing | 0
Missing (%) | 0.0%
Memory size | 4.5 MiB

Value | Count
1.0 | 474494
0.0 | 119500

Length

Statistic | Value
Max length | 3
Median length | 3
Mean length | 3
Min length | 3

Characters and Unicode

Statistic | Value
Total characters | 1781982
Distinct characters | 3
Distinct categories | 1
Distinct scripts | 1
Distinct blocks | 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Value
Unique | 0
Unique (%) | 0.0%

Sample

Row | Value
1st row | 1.0
2nd row | 0.0
3rd row | 1.0
4th row | 1.0
5th row | 1.0

Common Values

Value | Count | Frequency (%)
1.0 | 474494 | 79.9%
0.0 | 119500 | 20.1%
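
The frequency percentages follow directly from the counts. A minimal sketch of that arithmetic, using the two loan_paid_back counts reported above (the variable names are illustrative only):

```python
# Counts of each loan_paid_back value in the training data, as reported above.
counts = {"1.0": 474494, "0.0": 119500}

total = sum(counts.values())
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

print(total)   # 593994, the number of training observations
print(shares)  # {'1.0': 79.9, '0.0': 20.1}
```

Roughly four in five training loans are marked as paid back, so the two classes are noticeably imbalanced.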

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
1.0 | 474494 | 79.9%
0.0 | 119500 | 20.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 713494 | 40.0%
. | 593994 | 33.3%
1 | 474494 | 26.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1781982 | 100.0%

Most frequent character per category

(unknown)

Value | Count | Frequency (%)
0 | 713494 | 40.0%
. | 593994 | 33.3%
1 | 474494 | 26.6%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1781982 | 100.0%

Most frequent character per script

(unknown)

Value | Count | Frequency (%)
0 | 713494 | 40.0%
. | 593994 | 33.3%
1 | 474494 | 26.6%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1781982 | 100.0%

Most frequent character per block

(unknown)

Value | Count | Frequency (%)
0 | 713494 | 40.0%
. | 593994 | 33.3%
1 | 474494 | 26.6%

Interactions

[Figures: 25 interaction plots, each shown for Training Data and Test Data]

Correlations

[Figures: correlation plots for Training Data and Test Data]

Training Data

Variable | annual_income | credit_score | debt_to_income_ratio | education_level | employment_status | gender | grade_subgrade | interest_rate | loan_amount | loan_paid_back | loan_purpose | marital_status
annual_income | 1.000 | 0.004 | 0.005 | 0.008 | 0.009 | 0.004 | 0.007 | -0.003 | -0.009 | 0.020 | 0.007 | 0.010
credit_score | 0.004 | 1.000 | -0.060 | 0.007 | 0.051 | 0.008 | 0.638 | -0.517 | -0.008 | 0.232 | 0.008 | 0.011
debt_to_income_ratio | 0.005 | -0.060 | 1.000 | 0.006 | 0.088 | 0.004 | 0.024 | 0.026 | -0.012 | 0.334 | 0.006 | 0.004
education_level | 0.008 | 0.007 | 0.006 | 1.000 | 0.012 | 0.004 | 0.013 | 0.008 | 0.005 | 0.025 | 0.011 | 0.008
employment_status | 0.009 | 0.051 | 0.088 | 0.012 | 1.000 | 0.003 | 0.052 | 0.025 | 0.010 | 0.657 | 0.015 | 0.006
gender | 0.004 | 0.008 | 0.004 | 0.004 | 0.003 | 1.000 | 0.009 | 0.004 | 0.010 | 0.007 | 0.007 | 0.002
grade_subgrade | 0.007 | 0.638 | 0.024 | 0.013 | 0.052 | 0.009 | 1.000 | 0.192 | 0.013 | 0.228 | 0.008 | 0.013
interest_rate | -0.003 | -0.517 | 0.026 | 0.008 | 0.025 | 0.004 | 0.192 | 1.000 | -0.001 | 0.129 | 0.006 | 0.006
loan_amount | -0.009 | -0.008 | -0.012 | 0.005 | 0.010 | 0.010 | 0.013 | -0.001 | 1.000 | 0.013 | 0.008 | 0.008
loan_paid_back | 0.020 | 0.232 | 0.334 | 0.025 | 0.657 | 0.007 | 0.228 | 0.129 | 0.013 | 1.000 | 0.025 | 0.001
loan_purpose | 0.007 | 0.008 | 0.006 | 0.011 | 0.015 | 0.007 | 0.008 | 0.006 | 0.008 | 0.025 | 1.000 | 0.010
marital_status | 0.010 | 0.011 | 0.004 | 0.008 | 0.006 | 0.002 | 0.013 | 0.006 | 0.008 | 0.001 | 0.010 | 1.000

Test Data

Variable | annual_income | credit_score | debt_to_income_ratio | education_level | employment_status | gender | grade_subgrade | interest_rate | loan_amount | loan_purpose | marital_status
annual_income | 1.000 | 0.005 | -0.000 | 0.010 | 0.010 | 0.000 | 0.008 | -0.006 | -0.005 | 0.007 | 0.010
credit_score | 0.005 | 1.000 | -0.061 | 0.008 | 0.053 | 0.007 | 0.639 | -0.517 | -0.010 | 0.008 | 0.012
debt_to_income_ratio | -0.000 | -0.061 | 1.000 | 0.008 | 0.087 | 0.007 | 0.024 | 0.027 | -0.013 | 0.006 | 0.005
education_level | 0.010 | 0.008 | 0.008 | 1.000 | 0.012 | 0.001 | 0.011 | 0.006 | 0.008 | 0.011 | 0.007
employment_status | 0.010 | 0.053 | 0.087 | 0.012 | 1.000 | 0.005 | 0.053 | 0.025 | 0.008 | 0.014 | 0.004
gender | 0.000 | 0.007 | 0.007 | 0.001 | 0.005 | 1.000 | 0.008 | 0.004 | 0.008 | 0.003 | 0.002
grade_subgrade | 0.008 | 0.639 | 0.024 | 0.011 | 0.053 | 0.008 | 1.000 | 0.193 | 0.013 | 0.007 | 0.015
interest_rate | -0.006 | -0.517 | 0.027 | 0.006 | 0.025 | 0.004 | 0.193 | 1.000 | -0.000 | 0.005 | 0.006
loan_amount | -0.005 | -0.010 | -0.013 | 0.008 | 0.008 | 0.008 | 0.013 | -0.000 | 1.000 | 0.008 | 0.012
loan_purpose | 0.007 | 0.008 | 0.006 | 0.011 | 0.014 | 0.003 | 0.007 | 0.005 | 0.008 | 1.000 | 0.009
marital_status | 0.010 | 0.012 | 0.005 | 0.007 | 0.004 | 0.002 | 0.015 | 0.006 | 0.012 | 0.009 | 1.000
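
This section does not state which coefficient fills each cell of the matrices. As one hedged illustration of how an association between two categorical columns (for example employment_status and loan_paid_back) can be scored between 0 and 1, the sketch below computes Cramér's V from a contingency table in plain Python. The choice of Cramér's V, the function name `cramers_v`, and the toy data are all assumptions and are not taken from the report.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            # Expected cell count if the two variables were independent.
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data (hypothetical): repayment is fully determined by status in the
# first call and unrelated to it in the second.
status = ["Employed", "Employed", "Unemployed", "Unemployed"]
print(cramers_v(status, ["1.0", "1.0", "0.0", "0.0"]))  # 1.0
print(cramers_v(status, ["1.0", "0.0", "1.0", "0.0"]))  # 0.0
```

A measure of this kind is never negative, so it cannot indicate the direction of a relationship the way the signed numeric entries (such as -0.517 between credit_score and interest_rate) do.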

Missing values

Training Data

[Figure: A simple visualization of nullity by column.]

Test Data

[Figure: A simple visualization of nullity by column.]

Training Data

[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Test Data

[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]
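
To make the idea of a nullity matrix concrete, here is a small sketch in plain Python that builds a present/missing grid and per-column missing counts. The function name `nullity_summary` and the rows are hypothetical. One value is deliberately set to `None` for illustration, whereas both datasets in this report have zero missing cells.

```python
def nullity_summary(rows, columns):
    """Return (missing count per column, presence matrix) for a list of dict rows."""
    # True where a value is present, False where it is missing (None).
    matrix = [[row.get(col) is not None for col in columns] for row in rows]
    missing = {col: sum(not r[i] for r in matrix) for i, col in enumerate(columns)}
    return missing, matrix


# Hypothetical rows; the second has a missing credit_score for illustration.
columns = ["annual_income", "credit_score"]
rows = [
    {"annual_income": 29367.99, "credit_score": 736},
    {"annual_income": 22108.02, "credit_score": None},
]
missing, matrix = nullity_summary(rows, columns)
print(missing)
for r in matrix:
    # '#' marks a present value, '.' a missing one
    print("".join("#" if present else "." for present in r))
```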

Sample

Training Data

Index | annual_income | debt_to_income_ratio | credit_score | loan_amount | interest_rate | gender | marital_status | education_level | employment_status | loan_purpose | grade_subgrade | loan_paid_back
0 | 29,367.990 | 0.084 | 736 | 2,528.420 | 13.670 | Female | Single | High School | Self-employed | Other | C3 | 1.000
1 | 22,108.020 | 0.166 | 636 | 4,593.100 | 12.920 | Male | Married | Master's | Employed | Debt consolidation | D3 | 0.000
2 | 49,566.200 | 0.097 | 694 | 17,005.150 | 9.760 | Male | Single | High School | Employed | Debt consolidation | C5 | 1.000
3 | 46,858.250 | 0.065 | 533 | 4,682.480 | 16.100 | Female | Single | High School | Employed | Debt consolidation | F1 | 1.000
4 | 25,496.700 | 0.053 | 665 | 12,184.430 | 10.210 | Male | Married | High School | Employed | Other | D1 | 1.000
5 | 44,940.300 | 0.058 | 653 | 12,159.920 | 12.240 | Male | Single | Bachelor's | Employed | Other | D1 | 1.000
6 | 61,574.160 | 0.042 | 696 | 16,907.710 | 13.520 | Other | Single | High School | Self-employed | Debt consolidation | C5 | 1.000
7 | 45,953.310 | 0.100 | 654 | 10,111.620 | 12.820 | Female | Married | High School | Employed | Home | D1 | 1.000
8 | 30,592.290 | 0.132 | 713 | 7,522.360 | 9.480 | Male | Married | Bachelor's | Employed | Education | C5 | 1.000
9 | 17,342.450 | 0.121 | 548 | 9,653.480 | 16.040 | Female | Married | Bachelor's | Self-employed | Vacation | F1 | 1.000

Test Data

Index | annual_income | debt_to_income_ratio | credit_score | loan_amount | interest_rate | gender | marital_status | education_level | employment_status | loan_purpose | grade_subgrade
0 | 28,781.050 | 0.049 | 626 | 11,461.420 | 14.730 | Female | Single | High School | Employed | Other | D5
1 | 46,626.390 | 0.093 | 732 | 15,492.250 | 12.850 | Female | Married | Master's | Employed | Other | C1
2 | 54,954.890 | 0.367 | 611 | 3,796.410 | 13.290 | Male | Single | Bachelor's | Employed | Debt consolidation | D1
3 | 25,644.630 | 0.110 | 671 | 6,574.300 | 9.570 | Female | Single | Bachelor's | Employed | Debt consolidation | C3
4 | 25,169.640 | 0.081 | 688 | 17,696.890 | 12.800 | Female | Married | PhD | Employed | Business | C1
5 | 45,302.900 | 0.060 | 675 | 8,106.780 | 13.740 | Female | Married | High School | Employed | Vacation | C3
6 | 27,676.470 | 0.061 | 714 | 8,242.260 | 13.870 | Female | Single | High School | Employed | Debt consolidation | C4
7 | 38,216.910 | 0.095 | 719 | 3,765.500 | 15.100 | Male | Single | High School | Employed | Other | C5
8 | 25,650.590 | 0.101 | 664 | 20,310.640 | 11.740 | Male | Single | High School | Employed | Education | D4
9 | 62,497.030 | 0.207 | 651 | 5,177.580 | 13.900 | Female | Divorced | High School | Unemployed | Car | D2

Training Data

Index | annual_income | debt_to_income_ratio | credit_score | loan_amount | interest_rate | gender | marital_status | education_level | employment_status | loan_purpose | grade_subgrade | loan_paid_back
593984 | 36,169.340 | 0.091 | 676 | 9,986.830 | 14.180 | Female | Married | Bachelor's | Retired | Debt consolidation | C3 | 1.000
593985 | 37,188.430 | 0.170 | 718 | 17,056.520 | 10.470 | Female | Married | Bachelor's | Employed | Home | C3 | 1.000
593986 | 25,015.350 | 0.074 | 633 | 15,922.610 | 13.910 | Male | Married | Bachelor's | Employed | Debt consolidation | D2 | 0.000
593987 | 17,662.680 | 0.074 | 679 | 19,792.920 | 15.480 | Female | Single | Other | Employed | Debt consolidation | C3 | 1.000
593988 | 15,602.220 | 0.056 | 622 | 25,706.470 | 15.750 | Female | Married | High School | Employed | Debt consolidation | D2 | 1.000
593989 | 23,004.260 | 0.152 | 703 | 20,958.370 | 10.920 | Female | Single | High School | Employed | Business | C3 | 1.000
593990 | 35,289.430 | 0.105 | 559 | 3,257.240 | 14.620 | Male | Single | Bachelor's | Employed | Debt consolidation | F5 | 1.000
593991 | 47,112.640 | 0.072 | 675 | 929.270 | 14.130 | Female | Married | Bachelor's | Employed | Debt consolidation | C1 | 1.000
593992 | 76,748.440 | 0.067 | 740 | 16,290.400 | 9.870 | Male | Single | Bachelor's | Employed | Debt consolidation | B2 | 1.000
593993 | 48,959.520 | 0.096 | 752 | 7,707.730 | 10.310 | Male | Married | High School | Employed | Education | B3 | 1.000

Test Data

Index | annual_income | debt_to_income_ratio | credit_score | loan_amount | interest_rate | gender | marital_status | education_level | employment_status | loan_purpose | grade_subgrade
254559 | 30,351.940 | 0.061 | 683 | 17,247.500 | 12.310 | Male | Married | High School | Employed | Debt consolidation | C3
254560 | 29,792.120 | 0.111 | 648 | 23,495.020 | 14.630 | Male | Married | High School | Employed | Debt consolidation | D1
254561 | 24,451.560 | 0.086 | 752 | 28,920.260 | 8.540 | Female | Married | Bachelor's | Employed | Business | B3
254562 | 35,388.910 | 0.169 | 661 | 27,300.010 | 11.930 | Female | Married | High School | Employed | Debt consolidation | D4
254563 | 17,349.850 | 0.097 | 704 | 12,596.910 | 11.960 | Female | Single | Bachelor's | Employed | Debt consolidation | C4
254564 | 92,835.970 | 0.068 | 744 | 29,704.000 | 13.480 | Female | Single | Bachelor's | Employed | Debt consolidation | B2
254565 | 48,846.470 | 0.091 | 634 | 20,284.330 | 9.580 | Female | Married | High School | Employed | Debt consolidation | D4
254566 | 20,668.520 | 0.096 | 718 | 26,387.550 | 9.000 | Male | Single | Master's | Employed | Debt consolidation | C4
254567 | 34,105.090 | 0.094 | 739 | 11,107.360 | 9.810 | Male | Single | Bachelor's | Employed | Business | C2
254568 | 45,627.530 | 0.118 | 624 | 19,246.140 | 11.640 | Female | Married | High School | Employed | Car | D3

Duplicate rows

Training Data

annual_income | debt_to_income_ratio | credit_score | loan_amount | interest_rate | gender | marital_status | education_level | employment_status | loan_purpose | grade_subgrade | loan_paid_back | # duplicates
Dataset does not contain duplicate rows.

Test Data

annual_income | debt_to_income_ratio | credit_score | loan_amount | interest_rate | gender | marital_status | education_level | employment_status | loan_purpose | grade_subgrade | # duplicates
Dataset does not contain duplicate rows.
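
A duplicate-row check of this kind can be sketched with a counter over whole rows. The function name `duplicate_rows` and the three-column rows below are hypothetical; this is only an illustration of the idea, not the report's own procedure.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows: (annual_income, credit_score, grade_subgrade).
unique_rows = [(29367.99, 736, "C3"), (22108.02, 636, "D3")]
print(duplicate_rows(unique_rows))                     # {} means no duplicate rows
print(duplicate_rows(unique_rows + [unique_rows[0]]))  # first row now appears twice
```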